Modeling co-occupancy of transcription factors using chromatin features
Abstract
Regulation of gene expression requires both transcription factors (TFs) and epigenetic modifications, and interplay between the two has been discovered. However, the study of relationships between chromatin features and TF-TF co-occupancy remains limited. Here, we revealed this relationship by first illustrating distinct profile patterns of chromatin features associated with different binding events, including single-TF binding and TF-TF co-occupancy, for 71 TFs from five human cell lines. We then used statistical analyses to demonstrate the relationship by accurately predicting co-occupancy genome-wide from chromatin features, including DNase I hypersensitivity, 11 histone modifications (HMs) and GC content. Remarkably, our results showed that the combination of chromatin features enables accurate predictions across the five cell lines. Among individual chromatin features, DNase I hypersensitivity enables high and consistent predictions, and H3K27ac, H3K4me2, H3K4me3 and H3K9ac are more reliable predictors than the other HMs. Although the combination of 11 HMs achieves accurate predictions, its predictive ability varies considerably when a model obtained from one cell line is applied to others, indicating that the relationship between HMs and TF-TF co-occupancy is cell-type dependent. GC content alone is not a reliable predictor, but adding GC content to any other feature enhances predictive ability. Together, our results elucidate a strong relationship between TF-TF co-occupancy and chromatin features.
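The abstract does not say which statistical model was used, so the following is only a minimal sketch of the general idea: train a classifier on chromatin features in one cell line and score regions in another. It assumes a plain logistic regression and fully synthetic data. The feature names, the `simulate_cell` generator, the signal strengths and the two "cell lines" are hypothetical placeholders, not the authors' data or method.

```python
import math
import random

# 13 hypothetical features: DNase I signal, 11 histone modifications, GC content.
FEATURES = ["DNase"] + [f"HM{i}" for i in range(1, 12)] + ["GC"]


def simulate_cell(n, seed):
    """Synthetic regions for one made-up cell line (assumption, not real data).

    Co-occupied regions (label 1) get a shifted DNase mean, a weaker shift
    in each histone-mark signal, and no shift in GC content.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = int(rng.random() < 0.5)
        shift = 1.5 if label else 0.0
        row = [rng.gauss(shift, 1.0)]
        row += [rng.gauss(0.5 * shift, 1.0) for _ in range(11)]
        row.append(rng.gauss(0.0, 1.0))
        X.append(row)
        y.append(label)
    return X, y


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, b, row):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b)


def train_logistic(X, y, lr=0.1, epochs=200):
    """Fit logistic regression with batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for row, label in zip(X, y):
            err = predict(w, b, row) - label
            for j in range(d):
                gw[j] += err * row[j]
            gb += err
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Train on synthetic "cell line A", evaluate on synthetic "cell line B".
X_a, y_a = simulate_cell(400, seed=1)
X_b, y_b = simulate_cell(400, seed=2)
w, b = train_logistic(X_a, y_a)
cross_cell_auc = auc([predict(w, b, row) for row in X_b], y_b)
print(f"cross-cell AUC on synthetic data: {cross_cell_auc:.3f}")
```

Because both synthetic cell lines are drawn from the same distribution, the model transfers well here. The cell-type dependence reported for histone modifications would correspond to the two cell lines having different feature-label relationships, which this toy generator does not model.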
Related resources
Occupancy Classification of Position Weight Matrix-Inferred Transcription Factor Binding Sites
BACKGROUND: Computational prediction of Transcription Factor Binding Sites (TFBS) from sequence data alone is difficult and error-prone. Machine learning techniques utilizing additional environmental information about a predicted binding site (such as distances from the site to particular chromatin features) to determine its occupancy/functionality class show promise as methods to achieve more a...
Modeling interactions between adjacent nucleosomes improves genome-wide predictions of nucleosome occupancy
MOTIVATION: Understanding the mechanisms that govern nucleosome positioning over genomes in vivo is essential for unraveling the role of chromatin organization in transcriptional regulation. Until now, models for predicting genome-wide nucleosome occupancy have assumed that the DNA associations of neighboring nucleosomes on the genome are independent. We present a new model that relaxes this ind...
Co-occupancy by multiple cardiac transcription factors identifies transcriptional enhancers active in heart.
Identification of genomic regions that control tissue-specific gene expression is currently problematic. ChIP and high-throughput sequencing (ChIP-seq) of enhancer-associated proteins such as p300 identifies some but not all enhancers active in a tissue. Here we show that co-occupancy of a chromatin region by multiple transcription factors (TFs) identifies a distinct set of enhancers. GATA-bind...
Modeling Bias in DNase-seq Data for Improved Chromatin Occupancy Prediction
Whether or not a single gene is transcribed relies on a myriad of stochastic factors which may not be adequately described by the cell’s genome alone. Understanding the connection between the occupancy of a cell’s chromatin and the transcription of its genes would provide insight into the dynamic regulatory dependencies that control its internal transcription state, and so enhanced techniques f...
Preferential Genome Targeting of the CBP Co-Activator by Rel and Smad Proteins in Early Drosophila melanogaster Embryos
CBP and the related p300 protein are widely used transcriptional co-activators in metazoans that interact with multiple transcription factors. Whether CBP/p300 occupies the genome equally with all factors or preferentially binds together with some factors is not known. We therefore compared Drosophila melanogaster CBP (nejire) ChIP-seq peaks with regions bound by 40 different transcription fact...